MCN: Modulated Convolutional Network

[Figure 3.3: MCconv diagram. Input feature maps: 10 × 4 × 32 × 32; output feature maps: 20 × 4 × 30 × 30; reconstructed filters arranged in 20 groups of 10, with a sum taken within each group.]

FIGURE 3.3  MCNs Convolution (MCconv) with multiple feature maps. There are 10 and 20 feature maps in the input and the output, respectively. The reconstructed filters are divided into 20 groups, and each group contains 10 reconstructed filters, corresponding to the number of feature maps and MC feature maps, respectively.

map, h = 1, i = 1, ..., 10, g = 1, ..., 10, and for the second output feature map, h = 2, i = 11, ..., 20, g = 1, ..., 10.

When the first convolutional layer is considered, the input size of the network is 32 × 32.² First, each image channel is copied K = 4 times, resulting in a new input of size 4 × 32 × 32 to the entire network.
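This channel-copying step can be sketched in NumPy (the array shapes follow the text; the variable names are our own):

```python
import numpy as np

# Hypothetical single-channel 32 x 32 input image; the batch
# dimension is omitted for clarity.
x = np.random.rand(1, 32, 32)

K = 4  # number of copies per channel, as in the text

# Replicate the channel K times along the channel axis, yielding the
# 4 x 32 x 32 input that the first MCconv layer expects.
x_expanded = np.repeat(x, K, axis=0)
```

All K planes of `x_expanded` are identical copies of the original channel.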

It should be noted that the number of input and output channels in every feature map is the same, so MCNs can be easily implemented by simply replicating the same MCconv module at each layer.
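A toy shape-level sketch can illustrate why replication works: because every feature map keeps K = 4 channels, the output of one MCconv layer is a valid input to the next. The `mcconv` function below is a hypothetical stand-in that only mimics the shape behavior of a 3 × 3 MCconv layer; the random mixing weights are placeholders, not real reconstructed filters:

```python
import numpy as np

def mcconv(x, n_out, K=4):
    """Shape-level stand-in for an MCconv layer (illustration only):
    maps (n_in, K, H, W) -> (n_out, K, H-2, W-2), as a 3x3 "valid"
    convolution would, while keeping K channels per feature map."""
    n_in, k, H, W = x.shape
    assert k == K
    out = np.zeros((n_out, K, H - 2, W - 2))
    w = np.random.rand(n_out, n_in)  # placeholder mixing weights
    for h in range(n_out):
        for i in range(n_in):
            # crop simulates the spatial shrink of a 3x3 convolution
            out[h] += w[h, i] * x[i, :, 1:H-1, 1:W-1]
    return out

x = np.random.rand(10, 4, 32, 32)   # 10 input feature maps, K = 4
y = mcconv(x, n_out=20)             # 20 x 4 x 30 x 30, as in Fig. 3.3
z = mcconv(y, n_out=5)              # the same module chains directly
```

Since K never changes between layers, the identical module definition can be stacked without any per-layer shape adaptation.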

3.4.2  Loss Function of MCNs

To constrain CNNs to have binarized weights, we introduce a new loss function in MCNs. Two aspects are considered: unbinarized convolutional filters are reconstructed based on binarized filters, and intra-class compactness is incorporated based on the output features. We further introduce the variables used in this section: $C_i^l$ are the unbinarized filters of the $l$th convolutional layer, $l \in \{1, \ldots, N\}$; $\hat{C}_i^l$ denote the binarized filters corresponding to $C_i^l$; $M^l$ denotes the modulation filter (M-Filter) shared by all $C_i^l$ in the $l$th convolutional layer, and $M_j^l$ represents the $j$th plane of $M^l$; $\circ$ is a new plane-based operation (Eq. 3.12), which is defined in the next section. We then have the first part of the loss function for minimization:

$$
L_M = \frac{\theta}{2} \sum_{i,l} \big\| C_i^l - \hat{C}_i^l \circ M^l \big\|^2 + \frac{\lambda}{2} \sum_m \big\| f_m(\hat{C}, M) - \bar{f}(\hat{C}, M) \big\|^2, \qquad (3.18)
$$
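A minimal NumPy sketch of Eq. (3.18) for a single layer is given below, under two stated assumptions: the plane-based operation $\circ$ is approximated by element-wise scaling of each filter by the M-Filter (the real operation is defined by Eq. 3.12), and $\bar{f}$ is taken as the class-wise mean feature. The hyperparameter values are illustrative only:

```python
import numpy as np

def mcn_loss(C, C_hat, M, feats, labels, theta=1e-3, lam=1e-3):
    """Sketch of Eq. (3.18) for one convolutional layer.

    C, C_hat : (n_filters, K, W, W) unbinarized / binarized filters
    M        : (K, W, W) M-Filter shared by all filters in the layer
    feats    : (n_samples, d) output features; labels : (n_samples,)
    The plane-based operation is approximated here by element-wise
    scaling of each filter by M (an assumption for illustration)."""
    # First term: (theta/2) * sum_i ||C_i - C_hat_i o M||^2
    recon = np.sum((C - C_hat * M) ** 2)  # M broadcasts over filters
    # Second term: (lambda/2) * sum_m ||f_m - class mean of f_m||^2
    compact = 0.0
    for cls in np.unique(labels):
        f = feats[labels == cls]
        compact += np.sum((f - f.mean(axis=0)) ** 2)
    return theta / 2.0 * recon + lam / 2.0 * compact

# Toy usage: 5 binarized 4x3x3 filters, 8 samples from 2 classes.
C_hat = np.where(np.random.rand(5, 4, 3, 3) > 0.5, 1.0, -1.0)
M = np.abs(np.random.rand(4, 3, 3))
C = C_hat * M + 0.01 * np.random.randn(5, 4, 3, 3)
feats = np.random.randn(8, 16)
labels = np.repeat([0, 1], 4)
loss = mcn_loss(C, C_hat, M, feats, labels)
```

The loss vanishes exactly when every filter is perfectly reconstructed ($C_i = \hat{C}_i \circ M$) and every feature coincides with its class mean, matching the two design goals stated above.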

² We only use one channel of the gray-level images (3 × 32 × 32).